57 research outputs found

VLSI Design of a Fast Pipelined 8x8 Discrete Cosine Transform

Author: Rahman Ab Al-Hadi Ab
Zabidi Nurulnajah Mohd
Publication venue: 'Institute of Advanced Engineering and Science'
Publication date: 12/01/2016

This paper presents a Very Large Scale Integrated (VLSI) design and implementation of a fixed-point 8x8 multiplierless Discrete Cosine Transform (DCT) using the ISO/IEC 23002-2 algorithm. The standard DCT algorithm, which is mainly used in image and video compression technology, consists of only adders, subtractors, and shifters, making it efficient for hardware implementation. The VLSI implementation of the algorithm given in this paper further enhances the performance of the transform unit. Furthermore, circuit pipelining has been applied to the base design of the DCT, which significantly improves performance by shortening the longest path of the non-pipelined design. The DCT has been implemented with a semi-custom VLSI design methodology in the TSMC 0.13 µm process technology. Results show that our DCT designs can run at up to around 1.7 gigapixels/s, which is well above the throughput required for real-time ultra-high definition 8K video.

Crossref

Universiti Teknologi Malaysia Institutional Repository

Institute of Advanced Engineering and Science
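
As a rough illustration of what "multiplierless" means in the abstract above, a constant multiplication can be decomposed into shifts and additions. The sketch below is not taken from ISO/IEC 23002-2: the constant 181/256 (an approximation of cos(pi/4)) and the function name `scale_by_cos_pi_4` are assumptions chosen only for illustration.

```python
def scale_by_cos_pi_4(x: int) -> int:
    """Approximate x * cos(pi/4) using shifts and adds only.

    Hypothetical example: 181/256 (about 0.7070) stands in for cos(pi/4).
    181 = 128 + 32 + 16 + 4 + 1, so x * 181 needs four additions.
    """
    product = (x << 7) + (x << 5) + (x << 4) + (x << 2) + x  # x * 181
    return product >> 8  # divide by 256, rounding toward negative infinity


if __name__ == "__main__":
    for sample in (256, 100, -100):
        print(sample, scale_by_cos_pi_4(sample))
```

In hardware, each shift is free wiring and each `+` maps to an adder, which is why such decompositions are attractive for a fixed-point transform.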

Optimizing Dataflow Programs for Hardware Synthesis

Author: Ab Rahman Ab Al Hadi Bin
Publication venue: Lausanne, EPFL
Publication date: 14/01/2014

Infoscience - École polytechnique fédérale de Lausanne

A Low-complexity Complex-valued Activation Function for Fast and Accurate Spectral Domain Convolutional Neural Network

Author: Ab Rahman Ab Al-Hadi
Ayat Sayed Omid
Khalil-Hani Mohamed
Rizvi Shahriyar Masud
Publication venue: IAES Indonesia Section
Publication date: 30/03/2021

Conventional Convolutional Neural Networks (CNNs), which are realized in the spatial domain, exhibit high computational complexity. This results in high resource utilization and memory usage and makes them unsuitable for implementation in resource- and energy-constrained embedded systems. A promising approach for a low-complexity, high-speed solution is to apply CNNs modeled in the spectral domain. One of the main challenges in this approach is the design of activation functions. Some of the proposed solutions perform activation functions in the spatial domain, necessitating multiple, computationally expensive switches between the spatial and spectral domains. On the other hand, recent work on spectral activation functions has resulted in very computationally intensive solutions. This paper proposes a complex-valued activation function for spectral domain CNNs that only transmits input values that have a positive-valued real or imaginary component. This activation function is computationally inexpensive in both forward and backward propagation and provides sufficient nonlinearity to ensure high classification accuracy. We apply this complex-valued activation function in a LeNet-5 architecture and achieve an accuracy gain of up to 7% for the MNIST dataset and 6% for the Fashion MNIST dataset, while providing up to 79% and 85% faster inference times, respectively, over state-of-the-art activation functions for the spectral domain.

Indonesian Journal of Electrical Engineering and Informatics (IJEEI)
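
One possible reading of the activation rule stated in the abstract above is sketched here. It assumes "positive-valued real or imaginary component" means that a value passes unchanged when either component is positive and is zeroed otherwise; the paper may define the rule differently. The function name `positive_component_gate` is hypothetical.

```python
from typing import Iterable, List


def positive_component_gate(values: Iterable[complex]) -> List[complex]:
    """Pass z unchanged if Re(z) > 0 or Im(z) > 0, otherwise output 0.

    Assumed interpretation of the abstract, not the authors' definition.
    """
    return [z if (z.real > 0 or z.imag > 0) else 0j for z in values]


if __name__ == "__main__":
    spectrum = [1 + 1j, -1 - 1j, -1 + 2j, 3 - 4j]
    print(positive_component_gate(spectrum))
```

Under this reading the forward pass is a pair of sign checks per value, which is consistent with the abstract's claim of low computational cost.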

Dataflow Program Analysis and Refactoring Techniques for Design Space Exploration: MPEG-4 AVC/H.264 Decoder Implementation Case Study

Author: Alberti Claudio
Bin Ab Al Hadi
Casale Brunet Simone
Mattavelli Marco
Rahman Ab
Publication venue
Publication date: 02/10/2013

This paper presents a methodology to perform design space exploration of complex signal processing systems implemented using the CAL dataflow language. In the course of space exploration, the critical path in dataflow programs is first presented, and then analyzed using a new strategy for computational load reduction. These techniques, together with the detection of design bottlenecks, point to the most efficient optimization directions in a complex network. Following these analyses, several new refactoring techniques are introduced and applied to the dataflow program in order to obtain feasible design points in the exploration space. For an MPEG-4 AVC/H.264 decoder software and hardware implementation, the multi-dimensional space can be explored effectively for throughput, resource, and frequency, with real-time decoding ranging from QCIF to HD resolutions.

Infoscience - École polytechnique fédérale de Lausanne

Location closeness model for VANETs with integration of 5G

Author: Ab. Rahman Ab Al-Hadi
Junejo Muhammad Haleem
Shaikh Riaz Ahmed
Yusof Kamaludin Mohamad
Publication venue: 'Elsevier BV'
Publication date: 01/01/2021

Nowadays, 5G is playing a significant role in the efficiency of network security and creating more and faster channels for communication. 5G is evoking industries such as healthcare, education, marketing, transportation, and V2X (Vehicle-to-everything). In addition, 5G considers a new radio access technology that is adding new applications like the Internet of Things (IoT), Augmented Reality, Virtual Reality, connected cars, connected people-to-people, smart cities, and connected homes, which are considered to use higher bandwidth and low latency. This paper focuses mainly on the security challenges faced by the vehicular ad hoc network (VANET). VANET faces threats in three different fields: security, safety, and infotainment, each of which is exposed to numerous attacks. More precisely, this research conducted an in-depth study and proposed a VANET trust model. The proposed model deals specifically with the "location closeness" parameter. Moreover, the trust model is integrated with the 5G cloud to support greater coverage and effective network density with respect to network infrastructure and IoT as well. In this article, an effort has been made to implement the model using case studies to validate the trust model based on the "location closeness" parameter. The results proved the valid implementation of the model by identifying the trusted communication between the vehicles.

Universiti Teknologi Malaysia Institutional Repository

University of East Anglia digital repository

HEVC 2D-DCT architectures comparison for FPGA and ASIC implementations

Author: Ab Rahman Ab Al-Hadi
Awab Ainy Haziyah
Kamisian Izam
Meng Goh Kam
Rusli Mohd Shahrizal
Sheikh Usman Ullah
Publication venue: 'Universitas Ahmad Dahlan'
Publication date: 01/10/2019

This paper compares ASIC and FPGA implementations of two commonly used architectures for the 2-dimensional discrete cosine transform (DCT), the parallel and folded architectures. The DCT has been designed for sizes 4x4, 8x8, and 16x16, and implemented on Silterra 180nm ASIC and Xilinx Kintex Ultrascale FPGA. The objective is to determine suitable low-energy architectures, as their characteristics differ greatly in terms of cell usage and placement and routing methods on these platforms. The parallel and folded DCT architectures for all three sizes have been designed using Verilog HDL, including the basic serializer-deserializer input and output. Results show that for the large 16x16 transform, the ASIC parallel architecture uses roughly 30% less energy than the folded architecture. As for FPGAs, the folded architecture uses roughly 34% less energy than the parallel architecture. In terms of overall energy consumption between 180nm ASIC and Xilinx Ultrascale, the ASIC implementation uses about 58% less energy than the FPGA.

Journal of Education and Learning (EduLearn)

TELKOMNIKA (Telecommunication Computing Electronics and Control)

UAD Journal Management System

Design space exploration strategies for FPGA implementation of signal processing systems using CAL dataflow program

Author: Bezati Endri
Bin Ab Al Hadi
Casale Brunet Simone
Mattavelli Marco
Rahman Ab
Thavot Richard
Publication venue
Publication date: 14/01/2013

This paper presents some strategies for design space exploration of FPGA-based signal processing systems that are specified using the CAL dataflow language. The actor-oriented, high level of abstraction provided by CAL allows flexible exploration and consequently results in a wide range of feasible design implementations. We have applied and extended the existing techniques for refactoring and pipelining actors and actions by means of critical path analysis, and introduced some new buffering techniques based on heuristics. The combinations of these techniques have been applied on the CAL specification of the MPEG-4 video decoder, and synthesized to HDL for evaluation in the design implementation space. Results show that using our configuration for the exploration of 48 design points, a throughput range of roughly 8x has been achieved, for slice, block RAM, frequency, and latency range of 1.3x, 2.5x, 2.5x, and 2.9x respectively.

Infoscience - École polytechnique fédérale de Lausanne

DTAPO: Dynamic thermal-aware performance optimization for dark silicon many-core systems

Author: Ab. Rahman Ab. Al Hadi
Al Kubati Ali A. M.
Marsono M. N.
Mohammed Mohammed Sultan
Paraman Norlina
Publication venue: 'MDPI AG'
Publication date: 01/11/2020

Future many-core systems need to handle high power density and chip temperature effectively. Some cores in many-core systems need to be turned off, or kept 'dark', to manage chip power and thermal density. This phenomenon is known as the dark silicon problem. It prevents many-core systems from utilizing, and gaining improved performance from, a large number of processing cores. This paper presents a dynamic thermal-aware performance optimization (DTaPO) technique for optimizing the performance of dark silicon many-core systems under a temperature constraint. The proposed technique utilizes both task migration and dynamic voltage frequency scaling (DVFS) to optimize the performance of a many-core system while keeping the system temperature within a safe operating limit. Task migration puts hot cores in low-power states and moves tasks to cooler dark cores to aggressively reduce chip temperature while maintaining high overall system performance. To reduce the task migration overhead due to cold start, the source core (i.e., active core) keeps its L2 cache content during the initial migration phase. The destination core (i.e., dark core) can access it to reduce the impact of cold start misses. Moreover, the proposed technique limits task migration to cores that share the last level cache (LLC). In the case of a major thermal violation with no cooler cores available, DVFS is used to reduce the hot cores' temperature gradually by reducing their frequency. Experimental results for different threshold temperatures show that DTaPO can keep the average system temperature below the thermal limit. The execution time penalty is reduced by up to 18% compared with using only DVFS for all thermal thresholds. Moreover, the average peak temperature is reduced by up to 10.8 °C. In addition, the experimental results show that DTaPO improves the system's performance by up to 80% compared to optimal sprinting patterns (OSP) and reduces the temperature by up to 13.6 °C.

Universiti Teknologi Malaysia Institutional Repository
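
The decision order described in the DTaPO abstract above (prefer migration to a cooler dark core sharing the LLC, fall back to DVFS otherwise) might be sketched as follows. This is a simplified illustration, not the authors' implementation: the `Core` fields, the function name `plan_thermal_action`, the frequency step, and the minimum frequency are all hypothetical, and the sketch does not distinguish "major" from minor thermal violations.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Core:
    core_id: int
    temp_c: float    # current core temperature in degrees Celsius
    is_dark: bool    # True if the core is currently powered down
    llc_group: int   # cores with the same value share a last level cache
    freq_mhz: int


def plan_thermal_action(
    hot: Core,
    cores: List[Core],
    threshold_c: float,
    freq_step_mhz: int = 200,   # hypothetical DVFS step
    min_freq_mhz: int = 800,    # hypothetical frequency floor
) -> Tuple[str, Optional[int]]:
    """Return ("keep", None), ("migrate", target_core_id) or ("dvfs", new_freq_mhz)."""
    if hot.temp_c <= threshold_c:
        return ("keep", None)
    # Only dark cores in the same LLC group and below the threshold qualify.
    candidates = [
        c for c in cores
        if c.is_dark and c.llc_group == hot.llc_group and c.temp_c < threshold_c
    ]
    if candidates:
        coolest = min(candidates, key=lambda c: c.temp_c)
        return ("migrate", coolest.core_id)
    # No cooler dark core is available: lower the frequency one step.
    return ("dvfs", max(min_freq_mhz, hot.freq_mhz - freq_step_mhz))
```

Restricting candidates to the same `llc_group` mirrors the abstract's point that the destination core can reuse cached data and so reduce cold start misses.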

Lightweight Trust Model with Machine Learning scheme for secure privacy in VANET

Author: Ab Rahman Ab Al Hadi
Junejo Muhammad Haleem
Kumar Dileep
Memon Imran
Shaikh Riaz Ahmed
Yusof Kamaludin Mohamad
Publication venue: 'Elsevier BV'
Publication date: 01/01/2021

The vehicular ad hoc network (VANET) is transforming public transport into a safer wireless network, increasing its safety and efficiency. A VANET consists of several nodes, including RSUs (Roadside Units), vehicles, traffic signals, and other wireless communication devices, that communicate sensitive information over the network. Nevertheless, security threats are increasing day by day because of the dependency on network infrastructure, the dynamic nature of the network, and the control technologies used in VANET. These security threats could be addressed widely by using machine learning and artificial intelligence on the road transport nodes. In this paper, a comparison of trust and cryptography is presented based on the applications and security requirements of VANET.

University of East Anglia digital repository